At our last meeting, I used the error defined below to find the ‘best’ estimate: \[\Big(\sum(X11_{trend}-SSM_{trend})^2\Big) \times \Big(\sum(X11_{seasonal}-SSM_{seasonal})^2\Big) \times \Big(\sum(X11_{seasadj}-SSM_{seasadj})^2\Big)\]
And the candidate variance values were \((0.001, 0.01, 0.1, 1, 10, 100)\).
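As a concrete illustration of this product criterion, here is a minimal Python sketch (the component names and toy numbers are mine for illustration; the real analysis is in R):

```python
def product_error(x11, ssm):
    """Product of the three component-wise sums of squared differences."""
    total = 1.0
    for comp in ("trend", "seasonal", "seasadj"):
        sse = sum((a - b) ** 2 for a, b in zip(x11[comp], ssm[comp]))
        total *= sse
    return total

# toy series: each component contributes one factor to the product
x11 = {"trend": [1.0, 3.0], "seasonal": [0.0, 1.0], "seasadj": [2.0, 1.0]}
ssm = {"trend": [1.0, 1.0], "seasonal": [1.0, 1.0], "seasadj": [0.0, 1.0]}
print(product_error(x11, ssm))  # 4 * 1 * 4 = 16.0
```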
After some exploration, I found one point I had not noticed before.
By exploration I mean that I tried the two error measurements we mentioned last time, which are
\({\sum(X11_{trend}-SSM_{trend})^2} + {\sum(X11_{seasonal}-SSM_{seasonal})^2} + {\sum(X11_{seasadj}-SSM_{seasadj})^2}\) ;
\(\frac{\sum(X11_{irregular}-SSM_{irregular})^2}{\sigma_I^2} + \frac{\sum(X11_{trend}-SSM_{trend})^2}{\sigma_T^2} + \frac{\sum(X11_{seasonal}-SSM_{seasonal})^2}{\sigma_S^2}\) .
Based on the point I mentioned above, the second measurement no longer works, because:
\((1,1,1)\) and \((100,100,100)\) actually give us the same fitted components, yet this measurement prefers to choose the latter;
\((100,100,100)\) would then look like the ‘best’ choice only because these variances appear as denominators, not because it actually fits better.
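A toy numerical check of this point, assuming identical component fits (and hence identical SSEs) under both variance settings — pure illustration, not the real data:

```python
def scaled_error(sse, variances):
    """Sum of squared-error terms, each divided by its variance."""
    return sum(s / v for s, v in zip(sse, variances))

sse = (4.0, 9.0, 1.0)                       # same fitted components => same SSEs
small = scaled_error(sse, (1, 1, 1))        # 14.0
large = scaled_error(sse, (100, 100, 100))  # ~0.14
# the larger variances "win" purely through the denominators
print(small, large)
```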
I chose the first measurement mentioned above;
I changed the variance grid to 1:10 and to 1:20 separately.
The best fits are \((10,7,4)\) and \((18,12,7)\) respectively.
The related code is defined below:
# define our standard to choose parameters
# note this is for additive models
Dif <- function(x11, ssm, data){
  # X-11 components from the seas() fit
  x11_trend <- series(x11, 'd12')
  x11_seasonal <- series(x11, 'd10')
  x11_seasadj <- series(x11, 'd11')
  # smoothed states from the KFS() fit
  ssm_trend <- coef(ssm, states = 'trend')
  ssm_seasonal <- -rowSums(coef(ssm, states = 'seasonal'))
  ssm_seasadj <- data[-1] - ssm_seasonal[-length(data)]  # one observation shorter
  D <- sum((x11_trend - ssm_trend)^2) +
    sum((x11_seasonal[-1] - ssm_seasonal[-length(data)])^2) +
    sum((x11_seasadj[-1] - ssm_seasadj)^2)
  return(D)
}
# define the search function
exhaustion_version2 <- function(data){
  Difference <- c()
  index <- c()
  x11 <- seas(data, x11 = '')
  for (i in 1:10) {
    for (j in 1:10) {
      for (k in 1:10) {
        ssmm <- SSModel(data ~ SSMtrend(1, Q = list(j)) +
                          SSMseasonal(12, sea.type = 'dummy', Q = k),
                        H = i)
        ssm <- KFS(ssmm)
        dif <- Dif(x11, ssm, data)
        Difference <- c(Difference, dif)
        index <- rbind(index, c(i, j, k))
      }
    }
  }
  df <- data.frame(variance = index, difference = Difference)
  return(df)
}
unemp_exhaustionversion2 <- exhaustion_version2(unemp)
unemp_exhaustionversion2[which.min(unemp_exhaustionversion2$difference),]
## variance.1 variance.2 variance.3 difference
## 964 10 7 4 2937106
Note: to use the 1:20 grid, the 10s in exhaustion_version2 must be changed to 20 manually.
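The search pattern itself (independent of the SSM fitting) can be written so the grid is a parameter rather than a hard-coded loop bound — a hypothetical sketch, not the code used here:

```python
from itertools import product

def grid_argmin(error_fn, grid):
    """Exhaustively evaluate error_fn over a 3-D grid; return the best triple."""
    best, best_err = None, float("inf")
    for i, j, k in product(grid, grid, grid):
        e = error_fn(i, j, k)
        if e < best_err:
            best, best_err = (i, j, k), e
    return best, best_err

# stand-in error with its minimum at (10, 7, 4); range(1, 21) covers the 1:20 setting
best, _ = grid_argmin(lambda i, j, k: (i - 10)**2 + (j - 7)**2 + (k - 4)**2,
                      range(1, 21))
print(best)  # (10, 7, 4)
```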
At the last meeting, my result consisted of NaNs and 0s (the same in the MAP estimate).
I switched to the log-likelihood after our meeting, and the result is good!
According to this result, the maximum likelihood estimate is \((10,10,4)\), which is different from the best fit \((10,7,4)\) we obtained before.
The code producing this result is below (variance grid is 1:10):
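Before the code, it may help to spell out the model these terms come from. Assuming the KFAS specification SSMtrend(1) + SSMseasonal(12, sea.type = 'dummy') corresponds to the usual local-level plus dummy-seasonal form,

\[
\begin{aligned}
y_t &= \mu_t + \gamma_t + \varepsilon_t, &\varepsilon_t &\sim N(0, \sigma_y^2),\\
\mu_{t+1} &= \mu_t + \xi_t, &\xi_t &\sim N(0, \sigma_T^2),\\
\gamma_t &= -\sum_{j=1}^{11}\gamma_{t-j} + \omega_t, &\omega_t &\sim N(0, \sigma_S^2),
\end{aligned}
\]

the three quadratic terms in the function are the squared disturbances \(\varepsilon_t^2\), \(\xi_t^2\) and \(\omega_t^2\); in particular the accumulator a collects \(\omega_i^2 = \big(\sum_{j=0}^{11}\gamma_{i-j}\big)^2\).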
# define the log likelihood function
# because of the seasonal component's structure,
# the log likelihood is not computed at the first 11 points
loglikelihood <- function(data, trend, season, sigma){
  n <- length(data)
  a <- 0
  # a accumulates the squared seasonal disturbances
  for (i in 12:n) a <- a + (sum(season[(i-11):i]))^2
  l <- -(n-11)/2 * log(sigma[1]) -
    (n-11)/2 * log(sigma[2]) - (n-11)/2 * log(sigma[3]) -
    sum((data[-c(1:11)] - trend[-c(1:11)] - season[-c(1:10, n)])^2)/(2*sigma[1]) -
    sum((trend[-c(1:11)] - trend[-c(1:10, n)])^2)/(2*sigma[2]) -
    a/(2*sigma[3])
  return(l)
}
# define the log likelihood matrix
loglikelihood_matrix <- function(data){
  LL <- c()
  index <- c()
  for (i in 1:10) {
    for (j in 1:10) {
      for (k in 1:10) {
        ssmm <- SSModel(data ~ SSMtrend(1, Q = list(j)) +
                          SSMseasonal(12, sea.type = 'dummy', Q = k),
                        H = i)
        ssm <- KFS(ssmm)
        ssm_trend <- coef(ssm, states = 'trend')
        ssm_seasonal <- -rowSums(coef(ssm, states = 'seasonal'))
        sigma <- c(i, j, k)
        ll <- loglikelihood(data, ssm_trend, ssm_seasonal, sigma)
        LL <- c(LL, ll)
        index <- rbind(index, sigma)
      }
    }
  }
  df <- data.frame(variance = index, loglikelihood = LL)
  return(df)
}
unemp_loglikelihoodmatrix <- loglikelihood_matrix(unemp)
unemp_loglikelihoodmatrix[which.max(unemp_loglikelihoodmatrix$loglikelihood),]
## variance.1 variance.2 variance.3 loglikelihood
## sigma.993 10 10 4 -476017.3
I spent quite a long time trying to plot the matrix obtained above. The main difficulties are:
it is a 4-D problem;
if we settle for plotting 3-D data (fixing one variance at a time), our variances are discrete, which means a scatterplot is no problem, but how do we connect the points into a surface? I did NOT find a solution.
I have some results for the 3-D cases, but since they are scatterplots, the interpretation is not very good.
Remember: the mle is (10,10,4).
I will fix \(\sigma_y^2\) at each value in 1:10, then plot the log-likelihood against the other two variances in turn:
Based on these results, the intuitive conclusion may be:
as \(\sigma_y^2\) and \(\sigma_T^2\) increase, the (log-)likelihood increases;
as \(\sigma_S^2\) increases, the (log-)likelihood decreases.
Here, I fix \(\sigma_y^2\) at 10 and plot the corresponding log-likelihood w.r.t. \(\sigma_S^2\) as \(\sigma_T^2\) changes from 1 to 10:
Plot them separately:
Here, I choose the gamma distribution as the prior for each parameter, and have the following assumptions/settings:
each parameter has its own prior distribution;
the joint prior is defined as the product of three gamma distributions;
the ratio among \(\sigma_y^2, \sigma_T^2, \sigma_S^2\) is 10:5:1, so the expectations of the three gamma distributions are 10, 5 and 1;
the shape parameters of the three gamma distributions are taken from \(\{2, 2^2, 2^3, 2^4, 2^5\}\), that is to say we compute the MAP estimate \(5^3 = 125\) times;
most important: we take the LOG here.
The code is defined below:
logPrior_gamma <- function(sigma, alpha, beta){
  # unnormalised log density of a product of three gamma(shape, scale) priors;
  # terms not involving sigma are dropped, as they do not affect the MAP in sigma
  (alpha[1]-1) * log(sigma[1]) - sigma[1]/beta[1] +
    (alpha[2]-1) * log(sigma[2]) - sigma[2]/beta[2] +
    (alpha[3]-1) * log(sigma[3]) - sigma[3]/beta[3]
}
logposterior <- function(data, trend, season, sigma, alpha, beta){
  loglikelihood(data, trend, season, sigma) + logPrior_gamma(sigma, alpha, beta)
}
logposterior_matrix <- function(data, alpha, beta){
  LP <- c()
  index <- c()
  for (i in 1:10) {
    for (j in 1:10) {
      for (k in 1:10) {
        ssmm <- SSModel(data ~ SSMtrend(1, Q = list(j)) +
                          SSMseasonal(12, sea.type = 'dummy', Q = k),
                        H = i)
        ssm <- KFS(ssmm)
        ssm_trend <- coef(ssm, states = 'trend')
        ssm_seasonal <- -rowSums(coef(ssm, states = 'seasonal'))
        sigma <- c(i, j, k)
        lp <- logposterior(data, ssm_trend, ssm_seasonal, sigma, alpha, beta)
        LP <- c(LP, lp)
        index <- rbind(index, sigma)
      }
    }
  }
  df <- data.frame(variance = index, logposterior = LP)
  return(df)
}
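As a sanity check on the prior term, the same unnormalized gamma log-density can be sketched in Python (shape alpha and scale beta, so the prior mean is alpha*beta; names are mine, and constants are dropped just as in the R code):

```python
import math

def log_gamma_prior(sigma, alpha, beta):
    """Unnormalised log density of a product of Gamma(shape, scale) priors."""
    return sum((a - 1.0) * math.log(s) - s / b
               for s, a, b in zip(sigma, alpha, beta))

# shapes (2, 2, 2) with means (10, 5, 1) imply scales (5, 2.5, 0.5)
val = log_gamma_prior((1.0, 1.0, 1.0), (2, 2, 2), (5.0, 2.5, 0.5))
# at sigma = (1,1,1) the log terms vanish, leaving -(1/5 + 1/2.5 + 1/0.5) = -2.6
print(val)
```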
Since the runtime is quite long, I will import the final dataset directly (columns X.1–X.3 are the gamma shape parameters, and X.4–X.6 the corresponding scale parameters, i.e. mean/shape):
head(unemp_LP_gammamatrix)
## X variance.1 variance.2 variance.3 logposterior X.1 X.2 X.3
## 1 unemp_LP_gamma 10 10 4 -476025.4 2 2 2
## 2 unemp_LP_gamma 10 10 4 -476030.6 2 2 4
## 3 unemp_LP_gamma 10 10 4 -476041.0 2 2 8
## 4 unemp_LP_gamma 10 10 4 -476061.9 2 2 16
## 5 unemp_LP_gamma 10 10 4 -476103.8 2 2 32
## 6 unemp_LP_gamma 10 10 4 -476024.7 2 4 2
## X.4 X.5 X.6
## 1 5 2.50 0.50000
## 2 5 2.50 0.25000
## 3 5 2.50 0.12500
## 4 5 2.50 0.06250
## 5 5 2.50 0.03125
## 6 5 1.25 0.50000
And the result shows that the MAP estimate is always \((10,10,4)\) under our settings of the priors.